Union Join
The Union node is used to create a union between two or more tables, adding the union of those tables to the database schema. The Union node can be connected to any Select node (excluding Multi Select), Preparation, and Column Operation nodes. The Union function combines rows (vertically) from multiple tables, appending the rows in the second table to the first. Each table should have the same number of columns in it.
The Union node can be used to combine rows from tables from the data source, or to combine data source tables with new tables generated by data cleansing and preparation functions. For example, both the Date Range and Summarize functions generate new tables; these can then be joined to the original table from the data source.
A typical use case for a union is if you have two (or more) lists, consisting of the same column structure but different values, and you want to combine those lists into one. For example, you may have two lists of sales people and the number of items they sold from two different stores. You can combine the rows from both tables using the Union function to produce one master list.
Create a Union or Stack Join (vertical)
-
Connect the Union node to the tables to be combined in the order in which they should be appended to the new table.
The rows of each table are appended in the order in which the tables are connected to the Union node. The given tables should have the same number of columns, and the same data type.
-
With the Union node selected, go to the Properties panel to configure the union:
-
Resulting Table Name: Name the new table.
-
Union All: Clear this checkbox to eliminate any duplicate rows.
-
Add Source Table Name Column: Select this checkbox to add a column that lists the name of the source column for each row.
-

In this example, we have three Excel files from three different stores: 'Store A', 'Store B', and 'Store C'. Each spreadsheet contains a table called 'Transactions', which has the following columns: Sales Person ID, Number of Transactions, Items of Items Sold. Note that the Sales Person IDs '5' and '9' appear in 2 lists (for 'Store A' and 'Store C'):
After importing our spreadsheets into Pyramid, we want to create a new table that will combine all rows in each of the three Transactions tables into one table, along with a new column listing the original source table name for each row. This new table should look like this:
To achieve this, we need to connect each of the three tables to a Union node in the Data Flow.
Select the Union node on the canvas (green arrow below) and from the Properties panel (green highlight below) select:
- Union All because we know that there are duplicates in the Sales Person ID column, where sales people worked in multiple stores.
- Add Source Table Name Column so that we can see which store each sales person worked in.
The resulting table (seen in the Preview panel, blue highlight) combines all rows in each of the three lists:
Related information
Common Properties
There are a number of fields that are present in the Properties panel when you have any of the preceding nodes selected on the canvas. These fields include Result Properties, Column Selection, Set Variable Values, and Metadata.
- Click here for more details about the Common Properties